Introduction to Data-Oriented Parsing
Authors
Abstract
We present HPSG-DOP, a method for automatically extracting a Stochastic Lexicalized Tree Grammar (SLTG) from an HPSG source grammar and a given corpus. An SLTG is processed by a specialized fast parser. The approach has been tested on a large English grammar and has been shown to yield a further performance gain over parsing with a highly tuned HPSG parser. Our approach is simple and transparent. The extracted grammars are declaratively represented and have a high degree of practical applicability.
Head-Driven Phrase Structure Grammar (HPSG) has proven to be a successful formalism for specifying natural language grammars in a highly modular and compact manner (Pollard and Sag 1994). It supports the systematic definition of complex linguistic information, and of interactions between information on different strata, using typed feature constraints. On the other hand, inefficiency in processing such grammars is the major obstacle to using the HPSG formalism in practical NL applications (Makino et al. 1998).
Related resources
A memory-based model of syntactic analysis: data-oriented parsing
This paper presents a memory-based model of human syntactic processing: Data-Oriented Parsing. After a brief introduction (section 1), it argues that any account of disambiguation and many other performance phenomena inevitably has an important memory-based component (section 2). It discusses the limitations of probabilistically enhanced competence-grammars, and argues for a more principled mem...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in pipeline mode. Unfortunately, in pipel...
A Data-Oriented Approach to Semantic Interpretation
In Data-Oriented Parsing (DOP), an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new input sentence is constructed by combining sub-analyses from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotat...
The Effect of Morphology on Dependency Parsing of Persian
Data-driven systems can be adapted easily to different languages and domains. Applying this trend to dependency parsing led to the introduction of data-driven approaches. The availability of appropriate corpora containing sentences and their associated dependency trees is the only prerequisite for data-driven approaches. Despite highly accurate results on the dependency parsing task for the English langu...
Aspects Of Pattern-Matching In Data-Oriented Parsing
Data-Oriented Parsing (DOP) ranks among the best parsing schemes, pairing state-of-the-art parsing accuracy to the psycholinguistic insight that larger chunks of syntactic structures are relevant grammatical and probabilistic units. Parsing with the DOP model, however, seems to involve a lot of CPU cycles and a considerable amount of double work, brought on by the concept of multiple derivation...
Darwinised Data-Oriented Parsing - Statistical NLP with Added Sex and Death
We present the Darwinised Data-Oriented Parsing algorithm, an incremental, dynamic form of Data-Oriented Parsing, in which exemplars are used as replicators, subject to a selection pressure towards generalisability.
Publication date: 2003